Abstract

JHDL is a design tool for reconfigurable systems that
allows designers to express circuit organizations that dynamically
change over time in a natural way, using
only standard programming abstractions found in object-oriented
languages. JHDL manages FPGA resources in
a manner that is similar to the way object-oriented languages
manage memory: circuits are treated as distinct
objects, and a circuit is configured onto a configurable computing
machine (CCM) by invoking its constructor, effectively
“constructing” an instance of the circuit onto the reconfigurable
platform just as object instances are allocated
in memory with conventional object-oriented languages.
This approach of using object constructors/destructors to
control the circuit lifetime on a CCM is a powerful technique
that naturally leads to a dual simulation/execution
environment where a designer can easily switch between
software simulation and hardware execution on a
CCM with a single application description. Moreover,
JHDL supports dual hardware/software execution; parts
of the application described using JHDL circuit constructs
can be executed on the CCM while the remainder of the application
(the GUI, for example) can run on the CCM host.
Based on an existing programming language (Java), JHDL
requires no language extensions and can be used with any
standard Java 1.1 distribution.


1  Introduction 

When developing applications for configurable or
FPGA-based computing machines (CCM), designers must
perform two general tasks. First, they must design the circuitry
that implements the necessary functionality for the
application. This is typically done using commercial CAD
tools such as VHDL synthesis or schematic capture in concert
with the back-end tools obtained from the FPGA device
vendors. Second, designers must write a supervisory
program that controls the configurable-computing platform
during the operation of the application. In some cases this
control program is relatively simple, just loading a single
configuration and then loading and retrieving data. In more
complex cases (run-time reconfigured applications, for example),
these control programs can be relatively complex, loading
a variety of configurations and data, on demand, as the application
proceeds. Currently, the control program and the
circuit description must be developed and simulated independently;
the designer is responsible for ensuring that
these two pieces of software cooperate correctly, typically
through repeated compile, download, and execute cycles on
the CCM.

This division between circuit description and control
program is really just a division of the application into its
constituent static and dynamic parts: the static part represented
by a circuit library, and the dynamic part embodied
in a control program that chooses circuit configurations
from a library, configures devices, and executes the application.
However, given that dynamically changing hardware
is at the core of configurable computing, treating the
dynamic and static parts of the application independently
is awkward and limiting. What is needed is a single integrated
description that allows the designer to naturally express
the dynamic and static parts of the application simultaneously.
This paper describes a design approach and CAD
tool that focuses on the creation of such an integrated description.
As Hardware Description Languages (HDLs) are
innumerable and as it is difficult and time
consuming to come up with pithy acronyms, we have chosen
to name this system JHDL, for Just another HDL.


2  Project  Goals 

The primary objective of this research project is to develop
a tool-suite/design-environment for describing circuit
organizations that dynamically change their structure
over time. This project has the following additional requirements
and potential benefits.

1.  It  must  use  an  existing  programming  language  with 
no  extensions.  This  will  make  the  tool  accessible  to  a 
wider  range  of  programmers  by  allowing  them  to  use 
commercially-available  compilers. 

2. The CCM-control paradigm must be CCM independent.
CCM control details should be abstracted to a
higher-level programming abstraction. This will make
CCMs more accessible to programmers and will also
ease the process of retargeting an application to run
on a variety of different CCMs.

3.  The  description  method  must  support  run-time  and 
partial  configuration.  These  are  the  most  demanding 
CCM  applications  and  will  be  used  from  the  outset  to 
stress  the  design  environment. 

4. The integrated description must serve for both simulation
and final execution with no modifications. For
simulation, it must support end-to-end simulation of
applications that may consist of many configurations.
For execution, it should be possible to switch transparently
from a software simulation to hardware execution
on the CCM simply by changing a software
switch.


3  Background 

There have been several efforts to create textual CAD
tools for FPGA designs. In an early pioneering effort at
DEC PRL, Vuillemin and his group developed and used
Perle [9] to design CCM applications on DECPerle-1 and
more recently on the Pamette [7]. Perle is a C++-based
CAD tool that uses hierarchy and inheritance to describe
user circuits. A Perle description, when compiled and executed,
generates a netlist that is then processed by Xilinx
place and route tools. Other similar examples of object-oriented
circuit-design languages include Spyder [4] and
Lola [2].

Run-time reconfiguration (RTR) has been receiving
more attention lately and a few efforts are starting to report
results with tools and run-time environments. Luk
and Shirazi [5] reported on compilation tools for RTR designs.
Their tools consist of a partial evaluator, an incremental
configuration calculator, and an optimizer. One
of their goals is to automatically generate circuit overlays
that have been optimized for use in partially configured
applications. Burns and Donlin [1] reported on a run-time
system for dynamic configuration. They proposed
a run-time system that attempts to automatically manage
FPGA resources similar to the way a conventional OS manages
memory or CPU resources. The system as proposed
consists of a virtual hardware manager (for managing the
FPGA resources), a transform manager (for modifying circuits
to accommodate available device resources), a configuration
manager (to manage the configuration process),
and a device driver. Gokhale and Gomersoll [3] reported
on their high-level compilation tools for fine-grained FPGAs
such as the National CLAy device. These tools accept
a dbC (data-parallel C) version of the algorithm, partition
it into control and datapath, and then implement the circuit
using parameterizable module generators that have been
optimized for fine-grained FPGAs. Lysaght has also reported
on a VHDL-based simulation environment for RTR
[6].

JHDL has some things in common with many of these
efforts. First, as a design tool, it has been designed to directly
support run-time reconfiguration, both partial and
global, and it attempts to hide details of configuration from
the user. However, in contrast to other work, JHDL makes
no attempt to automatically identify partial configurations,
nor does it address the run-time physical transformation
of circuits so that they will fit within available FPGA resources.
At present, JHDL is primarily a manual design
tool that combines CCM control and circuit design into
a single integrated description. JHDL probably has more in
common with Perle, as it uses hierarchy in a manner similar
to Perle. However, it differs from Perle in that it was
specifically designed to support run-time reconfiguration
and CCM control.

Note that Java is not critical to this project; almost any
object-oriented language would have sufficed. Java does
have some useful features that can be exploited for this
project, in particular its portability and integrated GUI
API. However, any language that supports object
construction and hierarchy would be a likely candidate
for this project.

4  Research  Approach 

The primary distinction of JHDL, and indeed the primary
goal of this project, is the creation of a single integrated
API that allows the designer to express circuit organizations
that dynamically change over time. Stated another
way, the primary goal is to allow the designer to specify,
in a reasonably natural way, when hardware gets loaded onto
and removed from a CCM without exposing any of the details
normally associated with CCM operation. Rather than
invent a new language feature to schedule the configuration
of circuits, we chose to adopt the object-instance
construction/destruction mechanism used in object-oriented
languages. Conventional object-oriented languages manage
memory through object constructors and destructors.
Memory is allocated by invoking an object constructor that
allocates the necessary memory from the heap or stack and
sets object variables to initial values. Memory is reclaimed
by invoking an object destructor that frees the memory
to be used by other objects. JHDL manages FPGA
resources on CCMs in a similar manner. In JHDL, all circuits
are developed hierarchically as distinct objects. Allocating
FPGA resources, i.e., configuring the FPGA devices,
is performed by invoking the constructor for a circuit
object and, analogously, FPGA circuitry is reclaimed
by invoking the circuit’s destructor.

This approach of using object constructors/destructors
to control the circuit lifetime on a CCM is a powerful technique
that naturally leads to a dual simulation/execution
environment where a designer can easily switch between
software simulation and hardware execution on a
CCM with a single application description. When simulating
in software, the constructors/destructors communicate
with the JHDL simulation kernel. Constructors create
object instances in system memory; these object instances
are actually simulation models that interface with a simulation
kernel to provide a clock-by-clock simulation of the
user circuit. However, when executing in hardware (on the
CCM), the constructors/destructors communicate directly
with the CCM (through a JHDL interface layer) instead of
the simulation kernel. Instead of allocating system memory,
constructors load circuit descriptions from a circuit library
and control the execution of the CCM. Analogously,
destructors remove circuits by replacing existing circuits
with “blank” configurations, similar to the state that exists
when the FPGA is initially reset.
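
As an illustration of this mechanism, the following minimal Java sketch ties the lifetime of a circuit on a platform to the lifetime of its object. It is not the JHDL API: Backend, SimBackend, HwBackend, and ManagedCircuit are hypothetical names introduced only for this example, and the hardware backend merely records the driver requests it would issue.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical abstraction: where a circuit "lives" once it is constructed.
interface Backend {
    void load(String circuitName);    // called from the circuit constructor
    void unload(String circuitName);  // called from the explicit delete()
}

// Software simulation: circuits are model objects tracked in host memory.
class SimBackend implements Backend {
    final List<String> liveModels = new ArrayList<>();
    public void load(String name) { liveModels.add(name); }
    public void unload(String name) { liveModels.remove(name); }
}

// Hardware execution: a stand-in that logs the driver calls it would make.
class HwBackend implements Backend {
    final List<String> driverLog = new ArrayList<>();
    public void load(String name) { driverLog.add("configure " + name); }
    public void unload(String name) { driverLog.add("blank " + name); }
}

// A circuit whose presence on the platform follows its object lifetime.
class ManagedCircuit {
    private final Backend backend;
    private final String name;
    private boolean deleted = false;

    ManagedCircuit(Backend backend, String name) {
        this.backend = backend;
        this.name = name;
        backend.load(name);           // "constructing" the circuit
    }

    // Java has no destructors, so removal is an explicit, idempotent method.
    void delete() {
        if (!deleted) {
            backend.unload(name);
            deleted = true;
        }
    }
}
```

Under these assumptions, switching between simulation and hardware execution amounts to passing a different Backend to the same circuit constructor; the application code that creates and deletes circuits does not change.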


5  Overview 

The system described in this paper is very much an experiment
in progress. In these early stages, the main goal
was to demonstrate feasibility of the constructor/destructor
mechanism as a means for controlling a CCM; feasibility
of object-oriented languages as circuit design tools has
already been demonstrated by others. In its current state
JHDL implements a circuit simulator and the control API
for the Hotworks board from Virtual Computer Corporation
(VCC). Netlisting capability is supported by an internal
circuit graph data structure (hereafter referred to as
the circuit graph) that is maintained automatically by the
JHDL environment as circuits are constructed or destructed;
however, an actual netlist format has not been determined
at this point. (Note to reviewers: by the time of
FCCM, primitive netlisting will be implemented.) As
such, all circuits for the ATR demonstration system discussed
in this paper (the ATR shapesum circuits, for example)
were designed manually using schematic capture, with
their matching simulation models written using the JHDL
primitives. In addition, because this application uses run-time
configuration with partial configuration of the 6200
devices on the Hotworks board, most circuits were hand
placed. Full simulation of the ATR application was performed
in JHDL and the results of the simulation completely
match the hardware execution on the Hotworks
CCM. Note that the JHDL circuit description serves as the
sole means of controlling the VCC CCM, controlling all
I/O and configuration sequencing.

6  The  JHDL  System 

The current JHDL system is implemented as a set of
Java class libraries with functionality divided into two basic
areas: circuit simulation and CCM runtime support.
Circuit simulation classes allow the designer to design circuit
models that can be simulated at the clock level through
the JHDL simulation kernel. CCM runtime support classes
provide transparent access to CCM control functions via
the construction/destruction mechanism described earlier.

Designers develop circuits in JHDL by selecting from
a set of synchronous and combinational elements and
wiring these together to form an arbitrary synchronous
circuit. There are three different classes that can be used
to implement a circuit: CL (combinational), Synchronous
(clocked), and Structural (interconnection of combinational
or synchronous elements). When creating a new circuit,
the designer decides whether the outputs of the circuit
are updated continuously, i.e., it is a combinational circuit
(a CL object); are updated only on a clock edge, i.e., it
is a synchronous circuit (a Synchronous object); or whether it is
a structural circuit (a Structural object), i.e., one that is just
a set of existing synchronous or combinational circuit elements
interconnected together. In each case, the designer
defines a new class that inherits from the appropriate class
and implements the desired functionality in the constructor
and other methods. Individual circuits are interconnected
by instantiating Wire objects and passing these to the
circuit objects as arguments to their constructors.

Software  Simulation  under  JHDL 

The actual behavior of the newly defined circuit class
is specified differently, depending upon whether it is a CL
or Synchronous object. For CL objects, the designer must
write a propagate() method that will generate a new set
of outputs, based on the current inputs, each time it is
called. This propagate() method is automatically called
by the simulation kernel each time at least one of the input
wires connected to the circuit object registers a change
during simulation. For Synchronous objects, the designer
writes a behavior() method that will generate a new set of
outputs each time it is called. The behavior() method is
automatically invoked each time a new clock cycle is issued
by the simulation kernel. When designers are implementing
structural designs (as will be the case when a design
library and netlisting capability are available), they only
need to derive a new class that inherits from Structural and
write a constructor that will wire up the necessary library
elements to achieve the desired function. No propagate()
or behavior() methods are written for Structural circuits,
as their behavior is completely derived from the behaviors
of the interconnected constituent subcircuits. Any circuit
organization is possible and an arbitrary number of hierarchical
levels is supported by JHDL.
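
The division of labor between leaf classes and Structural classes can be illustrated with a small sketch. The class names CL, Structural, and Wire and the propagate() method follow the text; everything else is an assumption made for this example only: the one-bit Wire, the HalfAdder, OrGate, and FullAdder circuits, and the settle() helper, which stands in for the simulation kernel.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a JHDL wire; it carries a single bit here.
class Wire {
    boolean value;
}

// Combinational base class: outputs are recomputed whenever propagate() runs.
abstract class CL {
    abstract void propagate();
}

// Leaf combinational circuit; its wires are handed to the constructor.
class HalfAdder extends CL {
    private final Wire a, b, sum, carry;
    HalfAdder(Wire a, Wire b, Wire sum, Wire carry) {
        this.a = a; this.b = b; this.sum = sum; this.carry = carry;
    }
    void propagate() {
        sum.value = a.value ^ b.value;
        carry.value = a.value && b.value;
    }
}

class OrGate extends CL {
    private final Wire a, b, out;
    OrGate(Wire a, Wire b, Wire out) {
        this.a = a; this.b = b; this.out = out;
    }
    void propagate() { out.value = a.value || b.value; }
}

// Structural base class: it only records the elements wired by the constructor.
class Structural {
    protected final List<CL> parts = new ArrayList<>();

    // Stand-in for the kernel: evaluates parts in construction order,
    // which is a valid evaluation order for the circuit below.
    void settle() {
        for (CL part : parts) part.propagate();
    }
}

// A structural circuit has no propagate(); its constructor does the wiring.
class FullAdder extends Structural {
    FullAdder(Wire a, Wire b, Wire cin, Wire sum, Wire cout) {
        Wire s1 = new Wire(), c1 = new Wire(), c2 = new Wire();
        parts.add(new HalfAdder(a, b, s1, c1));
        parts.add(new HalfAdder(s1, cin, sum, c2));
        parts.add(new OrGate(c1, c2, cout));
    }
}
```

The FullAdder contains no behavioral code of its own; its function follows entirely from the two HalfAdder objects and the OrGate that its constructor connects through internal Wire objects.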

Currently, the simulation kernel is limited to synchronous,
globally clocked circuits, but as these are the only
kinds of circuits that consistently work on CCMs, this is
not a serious limitation (see footnote 1). However, if multiple clocks are
necessary, the simulation kernel can be easily modified to
support an arbitrary number of clocks. The JHDL simulator
was designed to handle circuits that are run-time reconfigured;
at any point in the simulation run the clock can
be stopped, circuit elements added to or deleted from existing
circuitry dynamically, and the simulation run continued
from where it stopped. This allows JHDL to simulate
complete and partial configuration of circuits on CCMs.

The HWSystem Class

In order to access the JHDL simulation kernel and CCM
control layer, the user circuit is encapsulated in a top-level
class called HWSystem, as shown in Figure 1. The HWSystem
provides all of the functionality for simulation and
CCM control. In addition, the HWSystem provides a means
of communicating with the external world or CCM host
over special wires, called “ports.” These ports synchronize
the input and output data with the global clock used
in JHDL. The user circuit itself may be composed of an arbitrary
number of JHDL circuit objects; only the top-level
object is encapsulated by the HWSystem.

Figure 1: Top level view of a JHDL design

The HWSystem is the essential link between hardware
execution and software simulation. It implements a simulation
kernel which invokes behavioral descriptions of circuit
objects during software simulation. The HWSystem
also implements the API to the CCM device drivers;
with this interface, it can coordinate the computation on the
CCM by configuring the appropriate circuits on the CCM,
loading data, etc. Similarly, the InPort or OutPort shown
in the figure can either be simulated behaviorally, as software
buffers that are synchronized to a global clock, or
executed on the CCM, as actual hardware ports that communicate
with a host or other hardware. This is shown
in Figure 2. Because of this duality, any circuit that can
be described in JHDL and compiled to a bitstream can
be run interchangeably in the simulation kernel and on a
hardware platform, without any modification of the source
code. This hardware-software abstraction is discussed further
in Section 6.

Figure 2: Top level view of JHDL system

Footnote 1: Some would argue that this limitation is a strength.


Finally, once the circuit is described, the JHDL system
is designed to interface to back-end synthesis tools with
relative ease. When a JHDL object is constructed, JHDL
automatically maintains an internal circuit graph describing
the circuit, including hierarchy and port types (including
width and direction). This was designed so that either
flat or hierarchical netlists could be easily generated from
the internal graph. In addition, this graph is also used by
the simulator to perform run-time checks of circuits during
simulation. It is also kept consistent as circuit elements
are added or deleted, as when CCM devices are partially
or run-time reconfigured. As is the case with all CAD
tools, a library of circuit primitives must be provided, but
once this is available and a netlist format determined, the
netlist for any structural circuit design will be generated
automatically. This is discussed
further in the future work section.

The  Simulation  Kernel 

The HWSystem object tracks all the wire and logic objects
that it encapsulates. All input and output data to the top-level
circuit is passed via the InPort and OutPort classes,
respectively. These classes provide support for buffered
transfers of data. When each circuit object is constructed,
it registers itself on a wire list or clock fanout list in the
HWSystem. This allows the system to track all the pointers
needed to perform simulation. The HWSystem performs
a circuit simulation using the following simple algorithm
(assuming that the user has already initialized all
data buffers):

1. Cause each InPort to drive its wires with the next data
in its buffer.

2. Issue a “clock” to all synchronous logic (by calling
the behavior() method for every Synchronous circuit
object).

3. Propagate all wires that were updated by the synchronous
logic.

4. Propagate all affected combinational logic (by invoking
the propagate() methods of all CL objects in the
wires’ fan-outs).

5. Repeat 3-4 above until the network is stable.

6. Cause each OutPort to write the state of its associated
wire into its buffer.

7. Repeat 1-6 above until the requested number of clocks
have been issued.

This  algorithm  is  represented  in  Figure  3. 
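
The steps above can be sketched in Java as follows. This is a simplified illustration and not the JHDL kernel: the Wire, CL, Synchronous, and Kernel classes are hypothetical stand-ins, wires carry plain integers, a single InPort and OutPort are assumed, and the Accumulator and Doubler circuits exist only to exercise the loop. Synchronous elements write their outputs immediately, so a chain of registers would need a two-phase update that is omitted here, and a combinational feedback loop that never settles would keep steps 3-5 running forever.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical wire: remembers whether its value changed since the last sweep.
class Wire {
    int value;
    boolean changed;
    final List<CL> fanout = new ArrayList<>();
    void put(int v) {
        if (v != value) { value = v; changed = true; }
    }
}

abstract class CL { abstract void propagate(); }
abstract class Synchronous { abstract void behavior(); }

// Clocked example circuit: running sum of its input.
class Accumulator extends Synchronous {
    private final Wire in, out;
    private int sum = 0;
    Accumulator(Wire in, Wire out) { this.in = in; this.out = out; }
    void behavior() { sum += in.value; out.put(sum); }
}

// Combinational example circuit: registers itself on its input wire's fanout.
class Doubler extends CL {
    private final Wire in, out;
    Doubler(Wire in, Wire out) {
        this.in = in; this.out = out;
        in.fanout.add(this);
    }
    void propagate() { out.put(2 * in.value); }
}

class Kernel {
    final List<Wire> wires = new ArrayList<>();
    final List<Synchronous> clocked = new ArrayList<>();
    final Deque<Integer> inBuffer = new ArrayDeque<>();
    final List<Integer> outBuffer = new ArrayList<>();
    Wire inPort, outPort;

    void cycle(int clocks) {
        for (int i = 0; i < clocks; i++) {                    // step 7
            if (!inBuffer.isEmpty()) inPort.put(inBuffer.poll()); // step 1
            for (Synchronous s : clocked) s.behavior();       // step 2
            boolean dirty = true;
            while (dirty) {                                   // step 5
                dirty = false;
                for (Wire w : wires) {                        // step 3
                    if (w.changed) {
                        w.changed = false;
                        dirty = true;
                        for (CL c : w.fanout) c.propagate();  // step 4
                    }
                }
            }
            outBuffer.add(outPort.value);                     // step 6
        }
    }
}
```

With an input buffer holding 1, 2, 3 and an Accumulator feeding a Doubler, three clocks leave 2, 6, 12 in the output buffer.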


Figure  3:  Operation  of  the  simulation  kernel 

Hardware  Execution  under  JHDL 

If the computation is to be performed in the FPGA hardware,
the computational steps change significantly. When
the HWSystem is constructed, it calls the device driver to
load the initial bitstream into the device. Currently, the
user must provide the HWSystem with the name of the
file containing the configuration bitstream. In the near future,
when netlisting capability is implemented, this will be
handled automatically by JHDL. When the HWSystem receives
the request to clock the circuit, it makes another device
driver call to clock the device the requested number
of times. The InPort and OutPort objects make their own
driver calls to exchange input/output values with the device.
The hardware execution cycle is as follows:

1. The user passes input data to the InPort buffer. The
InPort sends the data to the device via a driver call,
and the driver buffers the data.

2. The user requests a sequence of clocks. The HWSystem
makes a driver call to issue the clocks.

3. The driver issues the clocks and buffers the data for
each OutPort.

4. The HWSystem waits for the driver call to complete.
When it does, it requests the data for each OutPort
and loads the data into the appropriate objects.

This  algorithm  is  represented  in  Figure  4. 
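
The hardware execution cycle can be sketched as follows. The actual JHDL driver is native code reached through native methods, so this is only an illustration in plain Java: the CcmDriver interface, its method names, the FakeDriver (a pretend board that adds one to each input word), the HardwareSystem class, and the file name used in the usage note are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical driver API covering the four steps of the execution cycle.
interface CcmDriver {
    void loadBitstream(String fileName); // initial configuration
    void writeInputs(int[] data);        // step 1: driver buffers input data
    void clock(int cycles);              // steps 2-3: clock, buffer outputs
    int[] readOutputs();                 // step 4: fetch buffered outputs
}

// Fake board used only for illustration: outputs are inputs plus one.
class FakeDriver implements CcmDriver {
    final List<String> log = new ArrayList<>();
    private int[] inputs = new int[0];
    private int[] outputs = new int[0];

    public void loadBitstream(String fileName) { log.add("load " + fileName); }

    public void writeInputs(int[] data) {
        inputs = data.clone();
        log.add("write " + data.length);
    }

    public void clock(int cycles) {
        int n = Math.min(cycles, inputs.length);
        outputs = new int[n];
        for (int i = 0; i < n; i++) outputs[i] = inputs[i] + 1;
        log.add("clock " + cycles);
    }

    public int[] readOutputs() {
        log.add("read");
        return outputs.clone();
    }
}

// Stand-in for the hardware side of the HWSystem.
class HardwareSystem {
    private final CcmDriver driver;

    HardwareSystem(CcmDriver driver, String bitstreamFile) {
        this.driver = driver;
        driver.loadBitstream(bitstreamFile); // configure at construction time
    }

    int[] run(int[] inputData, int cycles) {
        driver.writeInputs(inputData);       // step 1
        driver.clock(cycles);                // steps 2-3
        return driver.readOutputs();         // step 4
    }
}
```

Calling run(new int[] {1, 2, 3}, 3) on a HardwareSystem built with a FakeDriver and a file name such as "design.bit" returns 2, 3, 4. Retargeting to another board would, under these assumptions, mean supplying a different CcmDriver implementation.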


Figure 4: Operation of the hardware interface

Any hardware device that supports the API of the JHDL
system can be used. The driver is simply compiled to native
code, and the driver calls are linked to the JHDL system
as native methods. The driver is loaded at run-time as
a shared library; changing the hardware platform is simply
a matter of loading a different library file. We implemented
a simple JHDL-compatible driver for the VCC HotWorks
board, which is based on a Xilinx XC6216-series FPGA.
The driver is a simple wrapper around a preexisting device
driver developed at BYU; the wrapper adds
buffering capability and exports the proper API. A similar
driver for other systems based on Xilinx XC4000-series
parts is under consideration.

7  Modeling  Hardware  Reconfiguration

In JHDL, circuits are configured and reconfigured on
CCMs via object constructors/destructors. However, Java
directly supports only explicit object construction; explicit
object deletion is not supported, as all memory is reclaimed
automatically by a run-time garbage collector. In order to
support explicit object deletion, JHDL provides a delete()
method for each of the base circuit classes. When delete()
is invoked on an object, say object A, A and all of its constituent
objects and wires are removed from the internal
netlist graph (dereferenced so they can be garbage collected)
and marked as deleted. During hardware execution,
invoking delete() is a signal to the CCM that the device area
currently occupied by the given circuit should be reclaimed by
“blanking” out the appropriate locations in the proper device.
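
A minimal sketch of this explicit deletion, assuming a simple tree-shaped circuit graph, is given below. CircuitNode and liveCount() are hypothetical names, wires are not modeled, and the blanking of the device during hardware execution is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical node of the internal circuit graph.
class CircuitNode {
    final String name;
    final CircuitNode parent;
    final List<CircuitNode> children = new ArrayList<>();
    boolean deleted = false;

    // The constructor keeps the graph up to date, as construction does in the text.
    CircuitNode(CircuitNode parent, String name) {
        this.parent = parent;
        this.name = name;
        if (parent != null) parent.children.add(this);
    }

    // Explicit deletion: detach this subtree and mark every node in it.
    void delete() {
        if (parent != null) parent.children.remove(this);
        markDeleted();
    }

    private void markDeleted() {
        deleted = true;
        for (CircuitNode c : children) c.markDeleted();
        children.clear(); // drop references so the subtree can be collected
    }

    // Number of nodes still reachable from this node, itself included.
    int liveCount() {
        int n = 1;
        for (CircuitNode c : children) n += c.liveCount();
        return n;
    }
}
```

Deleting a node with two children from a five-node graph leaves two reachable nodes, and all three removed nodes carry the deleted mark.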


JHDL supports partial reconfiguration, in a manner similar
to the reconfiguration approach already mentioned, through an
additional interface class, the PRSocket (Partially Reconfigurable
Socket). This class is used to describe parts of the
circuit that require partial configuration. It maintains the
information for the multiple configurations and automatically
provides the transparent switching between simulation and
hardware execution. The mnemonic implies that partial
reconfiguration is modeled as a discrete chip socket that
allows any chip with the right pinout to be “plugged in.”
The socket simply serves as a placeholder in the internal
JHDL netlist. The user connects the static logic in the circuit
to this socket, and provides the behavioral description
of what goes on inside that socket later. The PRSocket will
allow any circuit with the proper interface to be “plugged
in.” The basic function of the PRSocket is illustrated in
Figure 5. When the computation is being performed in
the simulator, the PRSocket simply dereferences all pointers
to the underlying JHDL object and creates an instance
of the newly configured object. When the computation is
performed on the hardware platform, the PRSocket communicates
with the hardware drivers to load the new partial
configuration at the appropriate chip location.


Figure 5: Describing partial reconfiguration

Partial  Reconfiguration:  Software  Simulation 

The user needs to tell the PRSocket in advance which
configurations it will encapsulate. This constraint helps the
PRSocket develop a netlist, and is also needed to facilitate
hardware execution. The list of configurations is encapsulated
in an object called a ConfigGroup by defining a
class that might look like this:

class myConfigGroup extends ConfigGroup {

    /* A Node is the base class for all JHDL logic */
    Node getNewCircuit(int id, PRSocket sock) {
        switch (id) {
        case 1:
            return new Circuit1(...);
        case 2:
            return new Circuit2(...);
        case 3:
            return new Circuit3(...);
        }
    }
}

This  ConfigGroup  object  is  passed  to  the  PRSocket 
when  it  is  constructed.  Once  the  PRSocket  is  constructed, 
the  user  connects  it  to  the  static  logic  just  like  any  other 
piece  of  logic.  During  circuit  construction  it  serves  as  a 
placeholder  in  the  JHDL  internal  netlist. 

When the user wants to reconfigure the circuit, the user
tells the PRSocket to load configuration n by calling
PRSocket.Reconfigure(n). The PRSocket then calls
ConfigGroup.getNewCircuit(n) to get an instance of the desired
circuit object. The new object gets pointers to the static
interface wires from the PRSocket, and adds itself to the
global netlist as usual. It can then be controlled by the simulation
kernel.

To complete the reconfiguration process, the PRSocket
must destroy the old logic that it contained. This is performed
by invoking the delete() method on the circuit to
be deleted. Before simulating again, the kernel sweeps
its global object lists and removes all references to deleted
objects. The partial reconfiguration process is depicted in
Figure 6. A more detailed look at how to write JHDL
code for partial reconfiguration is given in Section 8.
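
The simulation-side reconfiguration sequence can be sketched as follows. The class names PRSocket, ConfigGroup, and Node and the method getNewCircuit follow the text, but every body here is an assumption made for illustration: the lower-case reconfigure() method, the eval() method standing in for circuit behavior, the ScaleBy and TapGroup classes, and the liveObjects list standing in for the kernel's global object lists.

```java
import java.util.ArrayList;
import java.util.List;

// Base class for all logic in this sketch.
abstract class Node {
    boolean deleted = false;
    void delete() { deleted = true; }
    abstract int eval(int x); // hypothetical stand-in for circuit behavior
}

// One configuration: a multiplier with its weight compiled in.
class ScaleBy extends Node {
    private final int weight;
    ScaleBy(int weight) { this.weight = weight; }
    int eval(int x) { return weight * x; }
}

abstract class ConfigGroup {
    abstract Node getNewCircuit(int id, PRSocket sock);
}

// The configurations that the socket may hold, declared in advance.
class TapGroup extends ConfigGroup {
    Node getNewCircuit(int id, PRSocket sock) {
        switch (id) {
        case 1: return new ScaleBy(2);
        case 2: return new ScaleBy(3);
        default:
            throw new IllegalArgumentException("unknown configuration " + id);
        }
    }
}

class PRSocket {
    // Stand-in for the kernel's global object list.
    final List<Node> liveObjects = new ArrayList<>();
    private final ConfigGroup group;
    private Node current;

    PRSocket(ConfigGroup group) { this.group = group; }

    Node current() { return current; }

    void reconfigure(int id) {
        Node next = group.getNewCircuit(id, this); // build the new circuit
        liveObjects.add(next);                     // it joins the netlist
        if (current != null) current.delete();     // destroy the old logic
        current = next;
        liveObjects.removeIf(n -> n.deleted);      // sweep before simulating
    }

    // Valid only after the first reconfigure() call.
    int eval(int x) { return current.eval(x); }
}
```

After reconfigure(1) the socket scales its input by 2, and after reconfigure(2) by 3; the replaced ScaleBy object is marked deleted and no longer appears in liveObjects.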

7.1  Partial  Reconfiguration:  CCM  Execution 

The HWSystem keeps track of all the partial configuration
lists that are used in its PRSockets. When it is
constructed, it instructs the CCM device driver to keep a
pointer to all partial configurations, as well as loading the
static configuration. When the Reconfigure() method of a
PRSocket is called, it references the partial configuration
by an ID number and calls the device driver to load the appropriate
configuration into the corresponding part of the
circuit. This of course requires a hardware platform that
can perform partial configuration, like the Xilinx XC6200
series. For this project, the BYU device driver for the
VCC Hotworks board has been augmented to support partial
configuration.


Figure  6:  Partial  reconfiguration  during  simulation 

8  JHDL  Examples 

JHDL has already been used to describe and execute
several “real” applications, such as the “shapesum” and
“correlation” functions of the Chunky-SLD Automatic Target
Recognition (ATR) problem [8]. These applications
have been implemented on the Xilinx 6200 using partial
reconfiguration, as well as on other platforms [10]. For
this initial feasibility study, the original circuits are being
used as they were originally implemented via schematic
capture and manual placement. The main difference is
that the entire circuit is now described in JHDL. This description
provides a comprehensive simulation model of
the ATR application. This is itself noteworthy, as the ATR
application consists of several partial configurations that
are loaded into FPGA devices as new images are correlated
[10]. In addition, this same JHDL description is used
to directly control the VCC “Hotworks” CCM, replacing
the original Tcl/Tk program that was previously used for
controlling the CCM. Because netlisting capability is not
yet present in JHDL, the correspondence between JHDL
circuit object and configuration bitstream is managed manually
by the user through the ConfigGroup as already discussed
(see footnote 2). Even in these early stages, JHDL has proven to
be surprisingly effective at describing, simulating, and executing
these systems, including the partial reconfiguration
required to change the image template.

Footnote 2: In practice this means that the designer is responsible for generating
bitstreams with some tool and "telling" JHDL where these files are and
what circuit objects they are associated with. In the near future much of
this will be completely automated.

Although the ATR examples are currently operational,
they are far too complex to serve as coding examples of
JHDL in this paper. As a more accessible example of
JHDL usage, consider the following FIR filter example
that will be used to demonstrate the salient features of
the JHDL system. For this example, assume that the tap
weights are compiled directly into the circuit and partial
reconfiguration is used to modify the weights at runtime. At
the top level, the user would write standard Java to provide
the user interface, gather input data, and so forth. Assuming
that the FIR filter has already been designed (the next
section illustrates the FIR design), it would be "wired" into
the top-level system as shown below.

class myJavaCode {

    SomeMethod(...) {

        /* Create a new system */
        HWSystem system = new HWSystem();

        /* Tell the system how to compute; this time, we'll simulate */
        system.setSWMode();

        /* Create new wires to pass to the filter, 8 bits wide each.
           The "system" reference helps the Wire class build the circuit
           graph ("system" is the parent) */
        Wire Input = new Wire(system, 8);
        Wire Output = new Wire(system, 8);

        /* Create a new filter object; pass in pointers to I/O wires.
           Note that this configures when in hardware mode. */
        FirFilter filter = new FirFilter(system, Input, Output);

        /* Encapsulate the I/O wires with ports. */
        InPort inport = system.newInPort(Input);
        OutPort outport = system.newOutPort(Output);

        /* The object is now constructed and appropriately
           encapsulated. Now, reconfigure the tap constants; this method
           is user-defined. */
        filter.Reconfigure(GetNewTapConstants());

        /* Now, initialize the input buffer with data. */
        inport.writeBuffer(InputData);

        /* Allocate a new output buffer, same size as the data array */
        outport.newBuffer(InputData.length);

        /* Now clock the circuit, and get the results. */
        system.Clock(some_number);
        int Results[] = outport.getBuffer();
    }
}

The  user  can  then  take  the  results  and  process  them  as 
needed.  Now,  let’s  look  at  how  the  FIR  filter  circuit  might 
be  described. 

/* This is just a structural circuit - all behavior is
   described in the fir taps */
class FirFilter extends Structural {

    Wire[] data_wire_array, mac_wire_array;
    Wire fir_zero, data_input, data_output, fir_output;
    PRSocket FirTaps[];
    int tapCount;

    /* This manages the partial reconfigurations for each fir tap */
    static FirConfigGroup config = new FirConfigGroup();

    FirFilter(Node parent, Wire in, Wire out) {

        /* Every object that inherits from Node must do this first to
           build the netlist graph */
        super(parent);

        /* Now, declare my inputs/outputs, 8 bits each */
        inPort(in, 8); outPort(out, 8);

        fir_zero = new Wire(this, 8);

        ... ( initialize other wires in a similar manner )

        for (int i = 1; i < tapCount; i++) {

            FirTaps[i] = new PRSocket(this, config);

            /* Now we must declare the static interface to each PRSocket.
               Each wire is assigned a "port number" for reference. */
            FirTaps[i].inPort(data_wire_array[i-1], 0);
            FirTaps[i].inPort(mac_wire_array[i-1], 1);
            FirTaps[i].outPort(data_wire_array[i], 2);
            FirTaps[i].outPort(mac_wire_array[i], 3);
        }
    }

    /* The user defines this to export a top-level Reconfigure
       method to the outside world. However, the actual
       reconfiguration of each individual circuit is handled by the
       PRSocket. */
    public void Reconfigure(int tap_constants[]) {
        for (int i = 0; i < tapCount; i++) {
            FirTaps[i].Reconfigure(tap_constants[i]);
        }
    }
}

The  user  of  course  has  to  define  the  FirConfigGroup  so 
that  it  returns  the  kind  of  object  he  wants.  This  could  be 
done  as  follows: 

class  FirConfigGroup  extends  ConfigGroup  { 

/*  Tell  the  superclass  how  many  ports  (4)  and  potential 
configurations  (8  bit  fir  tap  constant  =>  256)  are  allowed.  */ 
public  FirConfigGroup(HWSystem  s)  {  super(s,  4,  256);  } 

/*  Here,  the  reconfiguration  is  completely  described  by 
returning  pointers  to  new  objects  of  the  desired  type.  In  our  case, 
the  id  just  represents  the  tap  constant  and  associated  name  of  the 
file  containing  the  configuration  to  be  loaded.  */ 
public  Node  getPRObject(int  id,  PRSocket  sock)  { 
return  new  FirTap(sock,  id); 

} 

} 


Finally, the user must describe the behavior of the fir
tap. The logical choice is to make this a Synchronous
object. It could be implemented as follows:

class FirTap extends Synchronous {
    int tap_constant;
    Wire fir_input, data_input, fir_output, data_output, d0_d1;

    public FirTap(PRSocket p, int constant) {

        /* The PRSocket is a Node, so it is the parent object. */
        super(p);

        tap_constant = constant;

        /* Now, get a pointer to all the wires that interface to the static
           logic. The data_input wire was assigned port #0; etc. */
        data_input = p.inConnect(0);
        fir_input = p.inConnect(1);
        data_output = p.outConnect(2);
        fir_output = p.outConnect(3);

        inPort(data_input, 8); ...
    }

    /* Here, we describe the actual computation of each tap.
       This is executed by the HWSystem once per clock. */
    public void behavior() {

        /* Multiply-accumulate the input value, and delay the input
           value. Wire values are read and written using the get() and
           put() methods, respectively. We pass a pointer to "this" with
           every access, which is used to help enforce port directions
           (necessary for netlisting). */
        fir_output.put(this, tap_constant * d0_d1.get(this)
                             + fir_input.get(this));
        data_output.put(this, d0_d1.get(this));
        d0_d1.put(this, data_input.get(this));
    }
}
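
To make the data flow through a chain of such taps concrete, the following plain-Java model mimics the multiply-accumulate and delay steps outside of JHDL. It is a sketch under one loudly stated assumption: every value written in behavior() is treated as a register that changes only at the clock edge. TapModel and FirChainModel are hypothetical names and do not use any JHDL class.

```java
import java.util.Arrays;

/** Plain-Java model of one tap; all three outputs are assumed to be registers. */
class TapModel {
    final int constant; // tap weight
    int delay;          // plays the role of d0_d1
    int dataOut;        // plays the role of data_output
    int macOut;         // plays the role of fir_output

    TapModel(int constant) { this.constant = constant; }
}

/** Chain of taps clocked together, using a compute-then-commit update. */
class FirChainModel {
    private final TapModel[] taps;

    FirChainModel(int[] constants) {
        taps = new TapModel[constants.length];
        for (int k = 0; k < constants.length; k++) taps[k] = new TapModel(constants[k]);
    }

    /** Applies one clock edge with input sample x; returns the last tap's output after the edge. */
    int clock(int x) {
        int n = taps.length;
        int[] nextDelay = new int[n];
        int[] nextData = new int[n];
        int[] nextMac = new int[n];
        // Phase 1: compute next values from the values held before the edge.
        for (int k = 0; k < n; k++) {
            int dataIn = (k == 0) ? x : taps[k - 1].dataOut;
            int macIn = (k == 0) ? 0 : taps[k - 1].macOut;
            nextMac[k] = taps[k].constant * taps[k].delay + macIn;
            nextData[k] = taps[k].delay;
            nextDelay[k] = dataIn;
        }
        // Phase 2: commit all registers at once.
        for (int k = 0; k < n; k++) {
            taps[k].macOut = nextMac[k];
            taps[k].dataOut = nextData[k];
            taps[k].delay = nextDelay[k];
        }
        return taps[n - 1].macOut;
    }
}

class FirChainDemo {
    public static void main(String[] args) {
        FirChainModel fir = new FirChainModel(new int[] {2, 3, 5});
        int[] impulse = {1, 0, 0, 0, 0, 0, 0};
        int[] out = new int[impulse.length];
        for (int n = 0; n < impulse.length; n++) out[n] = fir.clock(impulse[n]);
        System.out.println(Arrays.toString(out)); // [0, 0, 0, 2, 3, 5, 0]
    }
}
```

Under this register assumption, a chain of K taps reproduces the tap constants as its impulse response after a latency of K clocks, which is what the demo prints for K = 3.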


9  Evaluation  and  Conclusions 

We  believe  that  the  constructor/destructor  mechanism 
has  proven  to  be  a  feasible  way  to  control  configuration  on 
a  CCM.  In  addition,  JHDL  met  all  of  the  project  goals  that 
were  defined  at  the  outset  of  this  research  project: 

1. JHDL is based on a popular language and requires no
language extensions for circuit design.

2. The CCM control paradigm is CCM independent,
adopting the object-instance construction metaphor
from object-oriented languages. The abstraction will
work with any standard CCM and work is now under
way to interface JHDL to other CCMs such as the
Wildforce system from Annapolis Microsystems.

3. JHDL supports both partial and global configuration
and demonstration applications from ATR have been
implemented to show this capability.

4. A JHDL application description serves as both the
simulation model and the executable for CCM applications.
No code modifications are required, and switching
between software simulation and hardware execution on
the CCM requires setting only a single boolean variable.
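
The single-switch idea in item 4 can be sketched as follows. The sketch assumes a hypothetical SketchBackend interface with a software implementation and a mock standing in for the device driver; none of these names come from JHDL, and the "circuit" is just a placeholder computation.

```java
/** Hypothetical backend abstraction; JHDL's real classes are not shown here. */
interface SketchBackend {
    String name();
    int[] clock(int[] input);
}

/** Software simulation of a trivial "circuit" that increments each sample. */
class SoftwareSimBackend implements SketchBackend {
    public String name() { return "simulation"; }

    public int[] clock(int[] input) {
        int[] out = new int[input.length];
        for (int i = 0; i < input.length; i++) out[i] = input[i] + 1;
        return out;
    }
}

/** Mock standing in for calls through a CCM device driver. */
class MockHardwareBackend implements SketchBackend {
    public String name() { return "hardware"; }

    public int[] clock(int[] input) {
        int[] out = new int[input.length];
        for (int i = 0; i < input.length; i++) out[i] = input[i] + 1;
        return out;
    }
}

/** The application talks only to this class; one boolean picks the backend. */
class SketchSystem {
    private final SketchBackend backend;

    SketchSystem(boolean hardwareMode) {
        backend = hardwareMode ? new MockHardwareBackend() : new SoftwareSimBackend();
    }

    String mode() { return backend.name(); }

    int[] clock(int[] input) { return backend.clock(input); }
}

class ModeDemo {
    public static void main(String[] args) {
        boolean hardwareMode = false; // the only line that changes between runs
        SketchSystem system = new SketchSystem(hardwareMode);
        int[] results = system.clock(new int[] {1, 2, 3});
        System.out.println(system.mode() + ": " + results.length + " samples processed");
    }
}
```

Because the application code depends only on SketchSystem, nothing outside the constructor argument needs to change when moving between the two modes.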

JHDL also provides additional benefits because it is
based on a commonly-used programming language and as
such all of the standard language features, such as I/O, are
accessible to the designer throughout the design process.
Unlike VHDL for example, it is quite easy to perform
arbitrary I/O in JHDL, both to the console and to files during
software simulation. Of course, some of these features are
only accessible during simulation mode in JHDL; however,
that is when they may be of the most use. For example,
designers can easily insert print statements in their code as a
debugging aid so that the internal state of the application
can be ascertained during a simulation run. Once netlisting
is fully implemented, these I/O statements will just be
ignored. Graphical User Interfaces (GUIs) can also be easily
added to the program without the need for any complex
linking; they are just part of the JHDL application as the
complete Java API is available to the designer when writing
JHDL applications. JHDL has the added advantage
that the GUI (or any other software written and integrated
with the application) may be run on the host even when the
circuit parts of the application are executing on the CCM.
This is possible because JHDL allows the application to
be divided into those parts that will run in software and
those parts that will run in hardware. When operating in
hardware execution mode, only those parts of the application
that are described using circuit classes will be executed
on the CCM platform. All other parts of the application
remain on the host, operating essentially as a separate
program that is communicating with the user-designed circuitry
via the CCM device driver. In this way, JHDL allows
for both software and hardware descriptions to not only
coexist but also to coexecute.


10  Future  Work 

In addition to continuing experimentation with JHDL
for CCM applications, two areas have been identified for
further work in JHDL: netlisting, to allow JHDL to function
as a complete structural design tool, and behavioral
synthesis, to allow circuits to be expressed at a higher level.
Indeed, behavioral compilation was one of the first ideas
discussed in the early proposal phase of this project. However,
because it was necessary to first prove feasibility of
the basic concept of using object-instance construction as
a metaphor for CCM control, further development in this
area had to be delayed. In addition, JHDL is showing
significant potential as a purely structural design tool and
there is now interest at BYU in further developing JHDL in
this direction as well. The need for netlisting has already
been discussed in this paper and the necessary internal circuit
graph has already been fully implemented. What remains
is to select a netlist format and develop sets of circuit
libraries that are based on today's popular FPGA devices.
This effort is already underway.

Behavioral synthesis will be designed to exploit industry
CAD tools such as VHDL synthesis. The approach is
based on a fourth circuit class, HWProcess, that is already
partially implemented. Similar to the way circuits are defined
using the other CL and Structural classes, the designer
inherits from HWProcess when behavioral synthesis
is desired. The HWProcess class provides an additional
method, waitUntilClock(), that designers insert into their
behavior() methods. It provides the same basic behavior
as a wait() in a VHDL process. Designers will be able to
express circuits behaviorally with this class using a subset
of Java statements and developing a circuit description
that uses the common wait-until-clock idiom found in most
VHDL synthesizable subsets. The subset of Java statements
will be limited to statements that can be supported
by synthesizable VHDL subsets. This JHDL code will then
be translated to VHDL through a simple syntax-directed
textual substitution and processed with VHDL synthesis
tools. The advantage of this approach is that it allows the
circuit to be simulated in JHDL in a natural context with
other circuits but also provides a clear path to behavioral
synthesis that leverages currently available tools.
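
As a toy illustration of what a textual substitution for the wait-until-clock idiom might look like, the sketch below rewrites one call into a VHDL wait statement. Both the clock signal name "clk" and the choice of a rising-edge wait are assumptions made for this example; the translation rules planned for JHDL are not specified in this paper, and a real translator would have to handle far more than a single statement.

```java
/** Toy substitution for one idiom only; not the planned JHDL translator. */
class WaitIdiomTranslator {

    /** Assumed VHDL spelling of the idiom; the signal name "clk" is hypothetical. */
    static final String VHDL_WAIT = "wait until clk'event and clk = '1';";

    /** Replaces every waitUntilClock() call statement with the VHDL wait statement. */
    static String translate(String javaSource) {
        return javaSource.replace("waitUntilClock();", VHDL_WAIT);
    }

    public static void main(String[] args) {
        String body = "waitUntilClock();\ncount = count + 1;";
        System.out.println(translate(body));
    }
}
```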
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